Don't create Host instances with random host_id #623

sylwiaszunejko · 2025-12-18T13:21:21Z

This PR fixes inefficiencies in the host initialization mechanism when bootstrapping a cluster.

Previously, the driver created Host instances with connections from the contact points provided in the cluster configuration using random host IDs. After establishing the control connection and reading from system.peers, these initial Host instances were discarded and replaced with new ones created using the correct host metadata. This approach resulted in unnecessary creation and teardown of multiple connections.

Changes

The control connection is now initialized only using the endpoints specified in the cluster configuration.
After a successful control connection is established, the driver reads from system.local and system.peers.
Based on this metadata, Host instances are created with the correct host_id values.
Connections are then initialized directly on these properly constructed Host instances.
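
The bootstrap order listed above can be pictured with a small standalone model. This is only a sketch under stated assumptions: `FakeHost`, `fetch_rows`, and `bootstrap` are hypothetical names, and the row dictionaries merely imitate what system.local and system.peers would return. None of it is the driver's actual code.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeHost:
    """Hypothetical stand-in for the driver's Host; host_id is mandatory."""
    endpoint: str
    host_id: uuid.UUID


def fetch_rows(endpoint, cluster_rows):
    """Pretend to read system.local and system.peers over the control connection.

    The row of the node we are connected to comes first, then its peers.
    """
    local = [r for r in cluster_rows if r["endpoint"] == endpoint]
    peers = [r for r in cluster_rows if r["endpoint"] != endpoint]
    return local + peers


def bootstrap(contact_points, cluster_rows):
    """Connect using bare endpoints, then build hosts from real metadata."""
    reachable = {r["endpoint"] for r in cluster_rows}
    for endpoint in contact_points:
        if endpoint not in reachable:
            continue  # this contact point is down, try the next one
        rows = fetch_rows(endpoint, cluster_rows)
        # Hosts are created only now, with the host_id the cluster reported.
        return [FakeHost(r["endpoint"], r["host_id"]) for r in rows]
    raise RuntimeError("no contact point reachable")


rows = [{"endpoint": f"10.0.0.{i}", "host_id": uuid.uuid4()} for i in (1, 2, 3)]
hosts = bootstrap(["10.0.0.9", "10.0.0.2"], rows)
```

In this model no host object exists before the metadata is read, so there is nothing to discard and recreate afterwards.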

Fixes: #619

sylwiaszunejko · 2025-12-18T13:21:56Z

Some tests are still failing, but I wanted to ask if the direction is good @dkropachev

sylwiaszunejko · 2025-12-18T14:57:06Z

@Lorak-mmk maybe you know why this test assumes that new_host should be different from host?

def test_get_control_connection_host(self):
        """
        Test to validate Cluster.get_control_connection_host() metadata

        @since 3.5.0
        @jira_ticket PYTHON-583
        @expected_result the control connection metadata should accurately reflect cluster state.

        @test_category metadata
        """

        host = self.cluster.get_control_connection_host()
        assert host == None

        self.session = self.cluster.connect()
        cc_host = self.cluster.control_connection._connection.host

        host = self.cluster.get_control_connection_host()
        assert host.address == cc_host
        assert host.is_up == True

        # reconnect and make sure that the new host is reflected correctly
        self.cluster.control_connection._reconnect()
        new_host = self.cluster.get_control_connection_host()
        assert host != new_host

Lorak-mmk · 2025-12-18T15:12:17Z

I have no idea.
In the Rust Driver we have logic that, if the CC breaks, tries to connect it to all the other hosts (because the one it was connected to is presumed non-working for now).
I see no such logic in the Python Driver. This part was added in commit 2796ee5.

Was this test passing until now and non-flaky? If so, then perhaps there is such logic somewhere.

Lorak-mmk · 2025-12-18T15:13:04Z

Now that I think of it: I see that the driver uses the LBP to decide the order of hosts to connect to. See _connect_host_in_lbp and _reconnect_internal.
The LBP used by default is Round Robin, so on reconnect it will start from a different host than at the beginning, right? That would explain why each CC reconnect should land on a different host in a healthy cluster.
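
A toy model of that reasoning, assuming a round-robin policy that advances its start position on every query plan. `ToyRoundRobin` and `connect_control_connection` are hypothetical names for illustration, not the driver's `_connect_host_in_lbp` or `_reconnect_internal`.

```python
from itertools import cycle, islice


class ToyRoundRobin:
    """Hypothetical policy: each query plan starts one host later."""

    def __init__(self, hosts):
        self._hosts = list(hosts)
        self._position = 0

    def make_query_plan(self):
        start = self._position % len(self._hosts)
        self._position += 1
        # Rotate the host list so the plan begins at `start`.
        return list(islice(cycle(self._hosts), start, start + len(self._hosts)))


def connect_control_connection(policy, is_up):
    """Return the first host in the plan that accepts a connection."""
    for host in policy.make_query_plan():
        if is_up(host):
            return host
    raise RuntimeError("no host available")


policy = ToyRoundRobin(["n1", "n2", "n3"])
first = connect_control_connection(policy, lambda h: True)
second = connect_control_connection(policy, lambda h: True)  # simulated reconnect
```

With every host healthy, the second call lands on the next host in the rotation, which is what the `assert host != new_host` line in the test would rely on.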

sylwiaszunejko · 2025-12-18T15:21:00Z

Now that I think of it: I see that the driver uses the LBP to decide the order of hosts to connect to. See _connect_host_in_lbp and _reconnect_internal. The LBP used by default is Round Robin, so on reconnect it will start from a different host than at the beginning, right? That would explain why each CC reconnect should land on a different host in a healthy cluster.

Makes sense, second question: in this test:

def test_profile_lb_swap(self):
        """
        Tests that profile load balancing policies are not shared

        Creates two LBP, runs a few queries, and validates that each LBP is exercised
        separately between EP's

        @since 3.5
        @jira_ticket PYTHON-569
        @expected_result LBP should not be shared.

        @test_category config_profiles
        """
        query = "select release_version from system.local where key='local'"
        rr1 = ExecutionProfile(load_balancing_policy=RoundRobinPolicy())
        rr2 = ExecutionProfile(load_balancing_policy=RoundRobinPolicy())
        exec_profiles = {'rr1': rr1, 'rr2': rr2}
        with TestCluster(execution_profiles=exec_profiles) as cluster:
            session = cluster.connect(wait_for_all_pools=True)

            # default is DCA RR for all hosts
            expected_hosts = set(cluster.metadata.all_hosts())
            rr1_queried_hosts = set()
            rr2_queried_hosts = set()

            rs = session.execute(query, execution_profile='rr1')
            rr1_queried_hosts.add(rs.response_future._current_host)
            rs = session.execute(query, execution_profile='rr2')
            rr2_queried_hosts.add(rs.response_future._current_host)

            assert rr2_queried_hosts == rr1_queried_hosts

In this test it is assumed that both queries use the same host, presumably because the two RoundRobinPolicy instances start from the same host. But how can this be true if the starting position is randomized here: https://github.com/scylladb/python-driver/blob/master/cassandra/policies.py#L182

Lorak-mmk · 2025-12-18T15:38:37Z

No idea. Perhaps populate is not called for those policies for some reason, and they are populated using on_up/down etc?
Try to print a log / stacktrace in populate and run this test.

cassandra/cluster.py

dkropachev · 2025-12-19T13:07:05Z

cassandra/policies.py

+        if not self.local_dc:
+            self.local_dc = dc
+            return HostDistance.LOCAL


Should not be in this PR

@sylwiaszunejko, what is the reason for having it here?

+1, it is not obvious, nor explained anywhere.

@sylwiaszunejko, it looks like you reintroduced it in a recent push.

This is actually needed for any test to pass: distance is now called before on_add/on_up in add_or_renew_pool, and local_dc must not be None there. If it is None, all Hosts are marked as ignored. I agree it wasn't explained well enough.

@Lorak-mmk @dkropachev
Ok, so my findings are as follows.
Right now the flow is like this:

we try to establish the cc

we get the system.local and system.peers results

we call on_add on the cluster for every discovered host https://github.com/scylladb/python-driver/blob/master/cassandra/cluster.py#L2013

it calls distance and on_add on the lbp (at this point populate has not been called yet, because we don't know any hosts at the beginning, so _endpoints on the DC-aware policy is not set and we cannot set local_dc in on_add) https://github.com/scylladb/python-driver/blob/master/cassandra/policies.py#L254-L263

tests fail because all hosts are considered IGNORED (empty local_dc)

Before my change we would call populate with fake hosts (with proper endpoints but wrong host_ids), so _endpoints in the DC-aware policy would be correctly assigned and usable in on_add when we discovered the proper hosts (with the right host_ids).

To solve this I think we should drop the _endpoints logic (setting it to cluster.resolved_endpoints in populate and checking if the host endpoint is in _endpoints before setting local_dc to default in on_up) and just set local_dc to host.datacenter, regardless of whether the host is among the contact points provided to the cluster. WDYT?
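
The ordering problem and the proposed fix can be illustrated with a deliberately simplified model. `ToyDCAwarePolicy` and `Distance` are hypothetical and only mimic the relevant behaviour: the real policy has more state (for example remote-DC host limits) that is left out here.

```python
from enum import Enum


class Distance(Enum):
    LOCAL = "local"
    REMOTE = "remote"
    IGNORED = "ignored"


class ToyDCAwarePolicy:
    """Hypothetical policy that learns local_dc from the first host it sees."""

    def __init__(self, local_dc=None):
        self.local_dc = local_dc

    def on_add(self, datacenter):
        # Proposed behaviour: adopt the datacenter of the first added host,
        # without checking whether that host was a configured contact point.
        if not self.local_dc:
            self.local_dc = datacenter

    def distance(self, datacenter):
        if not self.local_dc:
            return Distance.IGNORED  # nothing known yet, so every host is ignored
        return Distance.LOCAL if datacenter == self.local_dc else Distance.REMOTE


policy = ToyDCAwarePolicy()
before = policy.distance("dc1")  # distance() called before on_add()
policy.on_add("dc1")
after = policy.distance("dc1")
other = policy.distance("dc2")
```

In this sketch the only thing that changes between `before` and `after` is that `on_add` ran first, which is the ordering the thread is arguing about.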

There could be a use case where users rely on the random dc/rack assignment that comes from DNS.
Say you have a single DNS name for the whole cluster and you point the driver at it without specifying dc/rack.

Let's change _refresh_node_list_and_token_map to specifically find and process the row that matches the endpoint of the connection it is running on.

Then it will end up in distance, where the policy can learn the dc or rack.
Please don't forget to make the same changes to RackAwareRoundRobinPolicy

When implementing RackAwareRoundRobinPolicy it was decided not to do implicit dc / rack #332
Specific comment: #332 (comment)

@dkropachev I am not sure I get your approach: isn't querying system.local on that connection and processing the host info from it enough? Or do you mean I should process it as the first host, before the peers results?
I thought we agreed that distance is not a good place to assign the default dc.
Do you agree with removing the _endpoints logic from the DC-aware policy?

I pushed a new version with the local host at the beginning of the list of hosts to process.

cassandra/cluster.py

sylwiaszunejko · 2025-12-19T14:34:02Z


test_profile_lb_swap was passing because populate was called before the cc was created, so we only knew about the contact points provided in the cluster config (so only one host). I believe the current approach (calling populate on the lbp after creating the cc, so we can update the lbp with all known hosts) is much better, so we should remove this test. @Lorak-mmk WDYT?

Lorak-mmk · 2025-12-19T15:13:24Z

In the previous approach (calling populate with one host) were the on_add calls correct (so one call for each host, besides CC host)?
If so, then both versions are correct. I think we could then switch to proposed version.

Lorak-mmk · 2025-12-19T15:13:50Z

You could then adjust the test, not remove it.

sylwiaszunejko · 2025-12-19T15:34:14Z

In the previous approach (calling populate with one host) were the on_add calls correct (so one call for each host, besides CC host)? If so, then both versions are correct. I think we could then switch to proposed version.

on_add is called properly, but if there is only one host during populate the starting position for RoundRobinPolicy is always the same even if some hosts are added later:

if len(hosts) > 1:
    self._position = randint(0, len(hosts) - 1)
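
To see why this guard matters, here is a toy reimplementation, assuming the policy randomizes its position only inside `populate` as in the snippet above. `ToyRoundRobin` and `first_hosts_old_flow` are illustrative names, not the driver's classes.

```python
from random import randint


class ToyRoundRobin:
    def __init__(self):
        self._hosts = []
        self._position = 0

    def populate(self, hosts):
        self._hosts = list(hosts)
        # Same guard as in the snippet above: one host means no randomization.
        if len(hosts) > 1:
            self._position = randint(0, len(hosts) - 1)

    def on_add(self, host):
        # Hosts discovered later extend the list but leave _position untouched.
        self._hosts.append(host)

    def make_query_plan(self):
        pos = self._position
        self._position += 1
        n = len(self._hosts)
        return [self._hosts[(pos + i) % n] for i in range(n)]


def first_hosts_old_flow():
    """Old flow: populate with the single contact point, then on_add the peers."""
    firsts = []
    for _ in range(2):  # two independent policies, like rr1 and rr2
        policy = ToyRoundRobin()
        policy.populate(["n1"])
        policy.on_add("n2")
        policy.on_add("n3")
        firsts.append(policy.make_query_plan()[0])
    return firsts


old = first_hosts_old_flow()

# New flow: populate sees every host, so the start position is randomized.
new_policy = ToyRoundRobin()
new_policy.populate(["n1", "n2", "n3"])
new_first = new_policy.make_query_plan()[0]
```

In the old flow both policies always begin at the contact point, which is why the equality assertion in test_profile_lb_swap used to hold. In the new flow the first host can be any of the three.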

cassandra/cluster.py

Lorak-mmk · 2025-12-29T12:40:59Z

Please let me review before merging.

Lorak-mmk · 2025-12-29T12:58:13Z

tests/integration/standard/test_query.py

        try:
-            host = [live_hosts[self.host_index_to_use]]
+            if len(live_hosts) > self.host_index_to_use:
+                host = [live_hosts[self.host_index_to_use]]
        except IndexError as e:
            raise IndexError(
                'You specified an index larger than the number of hosts. Total hosts: {}. Index specified: {}'.format(
                    len(live_hosts), self.host_index_to_use
                )) from e
        return host


Previously the IndexError (raised when len(live_hosts) <= host_index_to_use) was caught and rethrown with a descriptive message (presumably failing the test).
Now you have introduced an if which prevents the IndexError from happening at all.

If this change really is desirable, the code handling IndexError should be removed, since it is dead.

Please explain the reason for this change. Why should this condition now return an empty plan instead of raising an exception?
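
The two behaviours being compared can be sketched side by side. Both functions are hypothetical reductions of the test helper in the diff above. In particular, the `host = []` default is an assumption, since the part of the original function before the `try` is not shown.

```python
def pick_host_guarded(live_hosts, index):
    """Variant with the length check: an oversized index yields an empty plan."""
    host = []  # assumed default; the original function's preamble is not shown
    try:
        if len(live_hosts) > index:
            host = [live_hosts[index]]
    except IndexError as e:  # unreachable for a non-negative index
        raise IndexError("index larger than number of hosts") from e
    return host


def pick_host_strict(live_hosts, index):
    """Variant without the check: an out-of-range index is reported loudly."""
    try:
        return [live_hosts[index]]
    except IndexError as e:
        raise IndexError(
            "You specified an index larger than the number of hosts. "
            "Total hosts: {}. Index specified: {}".format(len(live_hosts), index)
        ) from e


guarded = pick_host_guarded(["n1"], 3)
try:
    pick_host_strict(["n1"], 3)
    strict_error = None
except IndexError as exc:
    strict_error = str(exc)
```

The guarded variant silently returns an empty plan, while the strict variant fails with the explanatory message. That difference is the design question raised in the review.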

My bad, I should have checked that len(live_hosts) is not 0 here.

The condition may have changed, but I still don't understand why it's necessary. Why should this specific case return an empty plan instead of throwing an exception?

We now initialize the lbp only after we learn all the hosts from the cc (unlike before, when the lbp was populated with values from the cluster config). While establishing the cc, I specifically handle the case where the lbp returns an empty query plan, and then we use the resolved endpoints.
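
A minimal sketch of that fallback, assuming it is built by chaining the policy's query plan with the resolved contact points. `EmptyPolicy`, `PopulatedPolicy`, and `candidate_endpoints` are hypothetical, and endpoints are plain strings here rather than driver objects.

```python
from itertools import chain


class EmptyPolicy:
    """Hypothetical policy that has not been populated yet."""

    def make_query_plan(self):
        return iter(())


class PopulatedPolicy:
    """Hypothetical policy that already knows two hosts (as endpoint strings)."""

    def make_query_plan(self):
        return iter(["10.0.0.2", "10.0.0.3"])


def candidate_endpoints(policy, resolved_contact_points):
    # Policy hosts come first; contact points are appended as a fallback,
    # so an empty plan still yields something to connect to.
    return list(chain(policy.make_query_plan(), resolved_contact_points))


initial = candidate_endpoints(EmptyPolicy(), ["10.0.0.1"])
reconnect = candidate_endpoints(PopulatedPolicy(), ["10.0.0.1"])
```

On the first connection only the contact point is tried. On a reconnect the hosts the policy already knows are tried before falling back to the contact points.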

tests/integration/standard/test_policies.py

Lorak-mmk · 2025-12-29T13:07:11Z

tests/integration/standard/test_metrics.py

+            with pytest.raises((WriteTimeout, Unavailable)):
                self.session.execute(query, timeout=None)
        finally:
            get_node(1).resume()

        # Change the scales stats_name of the cluster2
        cluster2.metrics.set_stats_name('cluster2-metrics')

        stats_cluster1 = self.cluster.metrics.get_stats()
        stats_cluster2 = cluster2.metrics.get_stats()

        # Test direct access to stats
-        assert 1 == self.cluster.metrics.stats.write_timeouts
+        assert (1 == self.cluster.metrics.stats.write_timeouts or 1 == self.cluster.metrics.stats.unavailables)
        assert 0 == cluster2.metrics.stats.write_timeouts


Why did the exception thrown change?

tests/integration/standard/test_control_connection.py

tests/integration/standard/test_cluster.py

Lorak-mmk · 2025-12-29T13:26:52Z

cassandra/policies.py

+        if not self.local_dc:
+            self.local_dc = dc
+            return HostDistance.LOCAL


+1, it is not obvious, nor explained anywhere.

cassandra/metadata.py

cassandra/cluster.py

sylwiaszunejko · 2025-12-30T10:40:07Z

@Lorak-mmk I haven't yet figured out why, in test_metrics_per_cluster, session.execute sometimes throws cassandra.Unavailable: Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level for cl ALL. Requires 3, alive 2" info={'consistency': 'ALL', 'required_replicas': 3, 'alive_replicas': 2} after one of three nodes is paused, but the rest of the comments are addressed.

Lorak-mmk

The commit "Don't create Host instances with random host_id" should be the last one, right? Without the test fixes introduced in the subsequent commits, I don't think this commit can pass the tests.

Lorak-mmk · 2025-12-30T12:14:20Z

tests/integration/standard/test_query.py

        try:
-            host = [live_hosts[self.host_index_to_use]]
+            if len(live_hosts) > self.host_index_to_use:
+                host = [live_hosts[self.host_index_to_use]]
        except IndexError as e:
            raise IndexError(
                'You specified an index larger than the number of hosts. Total hosts: {}. Index specified: {}'.format(
                    len(live_hosts), self.host_index_to_use
                )) from e
        return host


The condition may have changed, but I still don't understand why it's necessary. Why should this specific case return an empty plan instead of throwing an exception?

cassandra/cluster.py

cassandra/metadata.py

Lorak-mmk · 2025-12-30T12:41:28Z

cassandra/pool.py

        self.conviction_policy = conviction_policy_factory(self)
        if not host_id:
-            host_id = uuid.uuid4()
+            raise ValueError("host_id may not be None")
        self.host_id = host_id


Commit: "Don't create Host instances with random host_id"

The change here is the one that the commit message explains. Perhaps the chain((host.endpoint for host in lbp.make_query_plan()), self._cluster.endpoints_resolved) line is also explained. Other changes are not explained, and are not at all obvious to me.

When writing commits, please assume that a reader won't be as familiar with the relevant code as you are. This is almost always true: even if the reviewer is an active maintainer, there is a high chance they have not worked with this specific area recently.
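
For readers less familiar with this area, the effect of the host_id change in the diff above can be shown with a stand-in class. `ToyHost` is hypothetical and keeps only the check itself. The real Host constructor takes more arguments that are omitted here.

```python
import uuid


class ToyHost:
    """Hypothetical minimal Host: refuses to invent a host_id."""

    def __init__(self, endpoint, host_id=None):
        if not host_id:
            # Before the change a random uuid4 would have been generated here.
            raise ValueError("host_id may not be None")
        self.endpoint = endpoint
        self.host_id = host_id


known_id = uuid.uuid4()
host = ToyHost("10.0.0.1", host_id=known_id)

try:
    ToyHost("10.0.0.2")
    error = None
except ValueError as exc:
    error = str(exc)
```

Callers that previously relied on the implicit random ID now have to supply one, which is presumably why the unit tests were updated to pass host_id explicitly.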

… starting point

The `test_profile_lb_swap` test logic assumed that `populate` was called before control connection (cc) was created, meaning only the contact points from the cluster configuration were known (a single host). Due to that the starting point was not random. This commit updates the test to reflect the new behavior, where `populate` is called on the load-balancing policy after the control connection is created. This allows the policy to be updated with all known hosts and ensures the starting point is properly randomized.

Previously, the driver relied on the load-balancing policy (LBP) to determine the order of hosts to connect to. Since the default LBP is Round Robin, each reconnection would start from a different host. After removing fake hosts with random IDs at startup, this behavior changed. When the LBP is not yet initialized, the driver now uses the endpoints provided by the control connection (CC), so there is no guarantee that different hosts will be selected on reconnection. This change updates the test logic to first establish a connection and initialize the LBP, and only then verify that two subsequent reconnections land on different hosts in a healthy cluster.

Previously, we used endpoints provided to the cluster to create Host instances with random host_ids in order to populate the LBP before the ControlConnection was established. This logic led to creating many connections that were opened and then quickly closed, because once we learned the correct host_ids from system.peers, we removed the old Hosts with random IDs and created new ones with the proper host_ids. This commit introduces a new approach. To establish the ControlConnection, we now use only the resolved contact points from the cluster configuration. Only after a successful connection do we populate Host information in the LBP. If the LBP is already initialized during ControlConnection reconnection, we reuse the existing values.

sylwiaszunejko requested a review from dkropachev December 18, 2025 13:21

sylwiaszunejko force-pushed the remove_random_ids branch from 7f061e1 to ef382b9 Compare December 18, 2025 13:41

sylwiaszunejko force-pushed the remove_random_ids branch from ef382b9 to 9598dd5 Compare December 18, 2025 14:58

sylwiaszunejko force-pushed the remove_random_ids branch from 9598dd5 to 9e162dd Compare December 19, 2025 13:20

dkropachev requested changes Dec 19, 2025

sylwiaszunejko force-pushed the remove_random_ids branch 2 times, most recently from adddec1 to 3e864fc Compare December 20, 2025 12:58

sylwiaszunejko requested a review from dkropachev December 22, 2025 09:57

sylwiaszunejko force-pushed the remove_random_ids branch from 3e864fc to 0a1aa0e Compare December 22, 2025 12:59

sylwiaszunejko self-assigned this Dec 22, 2025

sylwiaszunejko requested a review from Lorak-mmk December 22, 2025 13:21

sylwiaszunejko marked this pull request as ready for review December 22, 2025 13:21

sylwiaszunejko force-pushed the remove_random_ids branch from dd1eb6f to a6cf3aa Compare December 22, 2025 13:56

dkropachev requested changes Dec 23, 2025

sylwiaszunejko force-pushed the remove_random_ids branch from a6cf3aa to fef57ae Compare December 29, 2025 11:09

sylwiaszunejko requested a review from dkropachev December 29, 2025 11:10

dkropachev reviewed Dec 29, 2025

sylwiaszunejko force-pushed the remove_random_ids branch from fef57ae to 8124928 Compare December 29, 2025 12:22

dkropachev approved these changes Dec 29, 2025

Lorak-mmk requested changes Dec 29, 2025

sylwiaszunejko force-pushed the remove_random_ids branch from 8124928 to c381c19 Compare December 29, 2025 15:37

sylwiaszunejko requested a review from Lorak-mmk December 30, 2025 10:40

Lorak-mmk requested changes Dec 30, 2025

sylwiaszunejko force-pushed the remove_random_ids branch from 2d17c1b to 14f78b5 Compare January 8, 2026 12:29

sylwiaszunejko requested a review from Lorak-mmk January 8, 2026 12:32

sylwiaszunejko force-pushed the remove_random_ids branch 3 times, most recently from 7b7cf1f to 02acb4c Compare January 9, 2026 13:44

sylwiaszunejko added 9 commits January 9, 2026 14:45

Use endpoint instead of Host in _try_connect

cdafe90

tests/integration/standard: don't compare Host instances

f63977f

Only compare hosts endpoints not whole Host instances as we don't know hosts ids.

tests/unit: Provide host_id when initializing Host

5aed786

tests/integration/standard: return empty query plan if there are no l…

9f4334f

…ive hosts

Don't check if host is in initial contact points when setting default…

eca72ec

… local_dc

Call on_add before distance to properly initialize lbp

46edc8f

In DC aware lbp when local_dc is not provided we set it in on_add and it needs to be initialized for distance to give proper results.

sylwiaszunejko force-pushed the remove_random_ids branch from 02acb4c to fff9753 Compare January 9, 2026 13:46
